 facebook post


HebID: Detecting Social Identities in Hebrew-language Political Text

arXiv.org Artificial Intelligence

Political language is deeply intertwined with social identities. While social identities are often shaped by specific cultural contexts and expressed through particular uses of language, existing datasets for group and identity detection are predominantly English-centric, single-label and focus on coarse identity categories. We introduce HebID, the first multilabel Hebrew corpus for social identity detection: 5,536 sentences from Israeli politicians' Facebook posts (Dec 2018-Apr 2021), manually annotated for twelve nuanced social identities (e.g. Rightist, Ultra-Orthodox, Socially-oriented) grounded by survey data. We benchmark multilabel and single-label encoders alongside 2B-9B-parameter generative LLMs, finding that Hebrew-tuned LLMs provide the best results (macro-F1 = 0.74). We apply our classifier to politicians' Facebook posts and parliamentary speeches, evaluating differences in popularity, temporal trends, clustering patterns, and gender-related variations in identity expression. We utilize identity choices from a national public survey, enabling a comparison between identities portrayed in elite discourse and the public's identity priorities. HebID provides a comprehensive foundation for studying social identities in Hebrew and can serve as a model for similar research in other non-English political contexts.
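As a rough illustration of the macro-F1 figure quoted above, the sketch below computes an unweighted mean of per-label F1 scores for a multilabel task. It is not the authors' evaluation code: the function name `macro_f1`, the toy sentences, and the convention of scoring a label as 0.0 when it has no gold or predicted instances are all assumptions made for this example.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 over a multilabel dataset."""
    scores = []
    for label in labels:
        # Count true positives, false positives and false negatives for this label.
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no gold and no predicted instances scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): three sentences, two of the identity labels.
labels = ["Rightist", "Ultra-Orthodox"]
gold = [{"Rightist"}, {"Rightist", "Ultra-Orthodox"}, set()]
pred = [{"Rightist"}, {"Rightist"}, {"Ultra-Orthodox"}]
score = macro_f1(gold, pred, labels)
```

Because every label contributes equally to the mean, a rare identity that the classifier misses entirely pulls the score down as much as a frequent one would.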


The Enemy from Within: A Study of Political Delegitimization Discourse in Israeli Political Speech

arXiv.org Artificial Intelligence

We present the first large-scale computational study of political delegitimization discourse (PDD), defined as symbolic attacks on the normative validity of political entities. We curate and manually annotate a novel Hebrew-language corpus of 10,410 sentences drawn from Knesset speeches (1993-2023), Facebook posts (2018-2021), and leading news outlets, of which 1,812 instances (17.4%) exhibit PDD and 642 carry additional annotations for intensity, incivility, target type, and affective framing. We introduce a two-stage classification pipeline combining finetuned encoder models and decoder LLMs. Our best model (DictaLM 2.0) attains an F1 of 0.74 for binary PDD detection and a macro-F1 of 0.67 for classification of delegitimization characteristics. Applying this classifier to longitudinal and cross-platform data, we see a marked rise in PDD over three decades, higher prevalence on social media versus parliamentary debate, greater use by male than female politicians, and stronger tendencies among right-leaning actors - with pronounced spikes during election campaigns and major political events. Our findings demonstrate the feasibility and value of automated PDD analysis for understanding democratic discourse.


Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking

arXiv.org Artificial Intelligence

Current automated fact-checking (AFC) approaches commonly evaluate evidence either implicitly via the predicted verdicts or by comparing retrieved evidence with a predefined closed knowledge source, such as Wikipedia. However, these methods suffer from limitations, resulting from their reliance on evaluation metrics developed for different purposes and constraints imposed by closed knowledge sources. Recent advances in natural language generation (NLG) evaluation offer new possibilities for evidence assessment. In this work, we introduce Ev2R, an evaluation framework for AFC that comprises three types of approaches for evidence evaluation: reference-based, proxy-reference, and reference-less. We evaluate their effectiveness through agreement with human ratings and adversarial tests, and demonstrate that prompt-based scorers, particularly those leveraging LLMs and reference evidence, outperform traditional evaluation approaches.


Evaluating Transparency of Machine Generated Fact Checking Explanations

arXiv.org Artificial Intelligence

An important factor when it comes to generating fact-checking explanations is the selection of evidence: intuitively, high-quality explanations can only be generated given the right evidence. In this work, we investigate the impact of human-curated vs. machine-selected evidence for explanation generation using large language models. To assess the quality of explanations, we focus on transparency (whether an explanation cites sources properly) and utility (whether an explanation is helpful in clarifying a claim). Surprisingly, we found that large language models generate similar or higher quality explanations using machine-selected evidence, suggesting carefully curated evidence (by humans) may not be necessary. That said, even with the best model, the generated explanations are not always faithful to the sources, suggesting further room for improvement in explanation generation for fact-checking.


Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation

arXiv.org Artificial Intelligence

Very large language models (LLMs) perform extremely well on a spectrum of NLP tasks in a zero-shot setting. However, little is known about their performance on human-level NLP problems which rely on understanding psychological concepts, such as assessing personality traits. In this work, we investigate the zero-shot ability of GPT-3 to estimate the Big 5 personality traits from users' social media posts. Through a set of systematic experiments, we find that zero-shot GPT-3 performance is somewhat close to an existing pre-trained SotA for broad classification upon injecting knowledge about the trait in the prompts. However, when prompted to provide fine-grained classification, its performance drops to close to a simple most frequent class (MFC) baseline. We further analyze where GPT-3 performs better, as well as worse, than a pretrained lexical model, illustrating systematic errors that suggest ways to improve LLMs on human-level NLP tasks.
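For readers unfamiliar with the most frequent class (MFC) baseline mentioned above, here is a minimal sketch: it always predicts whichever label was most common in the training data. The function names and the toy trait labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List


def most_frequent_class(train_labels: List[str]) -> str:
    """Return the label that occurs most often in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of positions where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Toy data (hypothetical): coarse labels for a single personality trait.
train = ["high", "low", "high", "high"]
test_gold = ["high", "low", "low", "high"]

mfc = most_frequent_class(train)
# The baseline ignores the input text and predicts the same label every time.
mfc_accuracy = accuracy(test_gold, [mfc] * len(test_gold))
```

A model that scores close to this baseline has learned little beyond the label distribution, which is why it is a common point of comparison.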


Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages

arXiv.org Artificial Intelligence

In the process of numerically modeling natural languages, developing language embeddings is a vital step. However, it is challenging to develop functional embeddings for resource-poor languages such as Sinhala, for which sufficiently large corpora, effective language parsers, and any other required resources are difficult to find. In such conditions, the exploitation of existing models to come up with an efficacious embedding methodology to numerically represent text could be quite fruitful. This paper explores the effectiveness of several one-tiered and two-tiered embedding architectures in representing Sinhala text in the sentiment analysis domain. Our findings show that the two-tiered embedding architecture, in which the lower tier consists of a word embedding and the upper tier consists of a sentence embedding, performs better than one-tier word embeddings, achieving a maximum F1 score of 88.04% in contrast to the 83.76% achieved by word embedding models. Furthermore, embeddings in the hyperbolic space are also developed and compared with Euclidean embeddings in terms of performance. A sentiment data set consisting of Facebook posts and associated reactions has been used for this research. To effectively compare the performance of different embedding systems, the same deep neural network structure has been trained on sentiment data with each of the embedding systems used to encode the associated text.


How we used machine learning to cover the Australian election

#artificialintelligence

During the last Australian election we ran an ambitious project that tracked campaign spending and political announcements by monitoring the Facebook pages of every major party politician and candidate. The project, dubbed the "pork-o-meter" (after the term pork-barreling), was hugely successful in identifying distinct patterns of spending based on vote margin or incumbent party, with marginal electorates receiving billions of dollars more in campaign promises than other electorates. All up, we processed 34,061 Facebook posts and 2,452 media releases, and published eight stories in addition to an interactive feature. We also used the same Facebook data to analyse photos posted during the campaign to break down the most common types of photo ops for each party, and how things have changed since the 2016 election. We were able to discover more than 1,600 election promises, amounting to tens of billions of dollars in potential spending.


Sentiment Analysis with Deep Learning Models: A Comparative Study on a Decade of Sinhala Language Facebook Data

arXiv.org Artificial Intelligence

The relationship between Facebook posts and the corresponding reaction feature is an interesting subject to explore and understand. To this end, we test state-of-the-art Sinhala sentiment analysis models against a data set containing a decade's worth of Sinhala posts with millions of reactions. To establish benchmarks and identify the best model for Sinhala sentiment analysis, we also test, on the same data set configuration, other deep learning models suited to sentiment analysis. In this study we report that the 3-layer Bidirectional LSTM model achieves an F1 score of 84.58% for Sinhala sentiment analysis, surpassing the current state-of-the-art model, Capsule B, which achieves an F1 score of only 82.04%. Further, since all the deep learning models show F1 scores above 75%, we conclude that it is safe to claim that Facebook reactions are suitable to predict the sentiment of a text.


New Croatian Restaurant Uses Five GammaChef Robots to Make Meals

#artificialintelligence

Typically when we write about food-making robots, they fall into one of two categories: smaller countertop devices meant for the home, or larger, more industrial robots meant for restaurant kitchens. But a restaurant called Bots&Pots in Zagreb, Croatia, is combining those two ideas and using a number of GammaChef cooking robots to make meals for its customers. GammaChef, also based in Croatia (and also a former Smart Kitchen Summit Startup Showcase finalist), makes the eponymous robot capable of creating one-pot dishes such as stews, risottos and pastas. The device stores ingredients, dispenses them into the pot, and stirs the food as it cooks. According to Total Croatia News, customers at Bots&Pots choose their meal via touchscreen at one of five GammaChefs inside the restaurant and they'll be able to see their meal prepared.


Fact check: Facebook didn't pull the plug on two chatbots because they created a language

USATODAY - Tech Top Stories

It's hard to escape artificial intelligence. From algorithms curating social media feeds to personal assistants on smartphones and home devices, AI has become part of everyday life for millions of people across the world. The future of that human-tech relationship may one day involve AI systems being able to learn entirely on their own, becoming more efficient, self-supervised and integrated within a variety of applications and professions. But some on social media claim this evolution toward AI autonomy has already happened. "Facebook recently shut down two of its AI robots named Alice & Bob after they started talking to each other in a language they made up," reads a graphic shared July 18 by the Facebook group Scary Stories & Urban Legends.